The publication of scientific data by World Data Centers and the National Library of Science and Technology in Germany

Authors

  • Jan Brase
  • Uwe Schindler
Abstract

In its 2004 report "Data and information", the International Council for Science (ICSU) strongly recommended a new strategic framework for scientific data and information. On an initiative from a working group of the Committee on Data for Science and Technology (CODATA), the German Research Foundation (DFG) started the project “Publication and Citation of Scientific Primary Data” in 2004 as part of the program “Information-infrastructure of network-based scientific-cooperation and digital publication”. Starting with the field of earth science, the German National Library of Science and Technology (TIB) is now established as a registration agency for scientific primary data as a member of the International DOI Foundation (IDF).

1 REGISTRATION OF SCIENTIFIC DATA

Primary data related to geoscientific, climate, and environmental research is stored locally at those institutions which are responsible for their evaluation and maintenance. In addition to the local data provision, the TIB saves the URL where the data can be accessed, together with all bibliographic metadata. When data are registered, the TIB provides a DOI as a unique identifier.

The Digital Object Identifier (DOI) is a system for identifying content objects in the digital environment. DOIs are names assigned to any entity for use on digital networks. They are used to provide current information about these entities, including where they (or information about them) can be found on the Internet. Information about a digital object may change over time, including where to find it, but its DOI will remain stable.

Any scientist working with this data is now able to cite the data in his work by its DOI. In this way, scientific primary data is not exclusively understood as part of a scientific publication but has its own identity. If a scientist reads a publication where registered data is used, he might be interested in analysing the data under different aspects.
He can now cite the data in his own publications using its DOI, referring to the uniqueness and separate identity of the original data. Since academic regard is often measured in so-called “citation indexes”, which count the number of citations of a scientist’s work, collecting data can thereby be acknowledged as an important part of academic work.

Because of the expected large number of datasets that need to be registered, we have decided to distinguish between citable datasets on the collection level and core datasets on the item level. Core datasets receive their identifiers, but their metadata is not included in the library catalogue. The DOI guarantees the accessibility of this data, for example to refer to the data inside a publication. Only citable datasets, usually collections of or publications from core datasets, are included in the catalogue.

2 THE DIGITAL OBJECT IDENTIFIER

To register the data, the TIB assigns it a DOI as a unique identifier. In May 200, the TIB became an official DOI Registration Agency. A DOI consists of two parts: a prefix and a suffix. For scientific data, a DOI looks like this:

10.1594/WDCC/IPCC_EH4_OPYC_SRES_B2_MM

10.1594 is the prefix and identifies that this DOI belongs to a scientific data set registered at the TIB; WDCC stands for the respective research institute (World Data Center for Climate, in this case), followed by the internal name of the data record at the WDCC.

Data Science Journal, Volume 5, 19 October 2006 206

A DOI can be resolved in every web browser worldwide, using the Handle system from the Corporation for National Research Initiatives (CNRI). A Handle server, for example, is installed at the webpage of the International DOI Foundation (IDF).
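The DOI structure described in section 2 can be illustrated with a minimal Python sketch. It assumes that every data DOI follows the prefix/INSTITUTE/record-name pattern of the single example given in the text; the names `DataDoi`, `parse_data_doi`, and `resolver_url` are hypothetical and not part of any TIB or IDF software. The resolver base URL is the dx.doi.org address used in this article.

```python
from dataclasses import dataclass

# Resolver address as used in the article; everything else is illustrative.
RESOLVER_BASE = "http://dx.doi.org/"


@dataclass(frozen=True)
class DataDoi:
    prefix: str     # e.g. "10.1594", the prefix for data sets registered at the TIB
    institute: str  # e.g. "WDCC", the research institute holding the data
    record: str     # internal name of the data record at that institute

    @property
    def suffix(self) -> str:
        return f"{self.institute}/{self.record}"

    def __str__(self) -> str:
        return f"{self.prefix}/{self.suffix}"


def parse_data_doi(doi: str) -> DataDoi:
    """Split a DOI of the assumed form prefix/INSTITUTE/record-name."""
    prefix, sep, suffix = doi.strip().partition("/")
    if not sep or not prefix.startswith("10."):
        raise ValueError(f"not a DOI: {doi!r}")
    institute, sep, record = suffix.partition("/")
    if not sep or not institute or not record:
        raise ValueError(f"suffix is not INSTITUTE/record-name: {suffix!r}")
    return DataDoi(prefix, institute, record)


def resolver_url(doi: DataDoi) -> str:
    """Build the web address under which a browser can resolve the DOI."""
    return RESOLVER_BASE + str(doi)


doi = parse_data_doi("10.1594/WDCC/IPCC_EH4_OPYC_SRES_B2_MM")
print(doi.prefix, doi.institute, doi.record)
print(resolver_url(doi))
```

The sketch only splits and reassembles the identifier string; it does not contact a Handle server or check whether the DOI is actually registered.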
Resolving this DOI is therefore possible by using the URL:

http://dx.doi.org/10.1594/WDCC/IPCC_EH4_OPYC_SRES_B2_MM

Furthermore, it is possible to install a free plug-in for Internet Explorer that resolves this DOI when it is typed into the address bar of the browser. The DOI registration at the TIB works on a cost-recovery basis, enabling a persistent registration of scientific results for less than half a dollar.

3 SCIENTIFIC DATA IN THE LIBRARY CATALOGUE

Scientific data is now accessible via the online library catalogue of the TIB (see fig. 1). The catalogue data for the content is based on the application profile of the STD-DOI project for scientific data. The profile includes all metadata identified in ISO 690-2 as obligatory for the citing of electronic media, together with Dublin Core based standard metadata attributes. A detailed analysis of the metadata used can be found in Brase (2004).

Fig. 1 A published dataset as a query result in the online catalogue of the TIB

The TIB offers an XML-based web service infrastructure that allows the data providers to include the registration and publication of scientific data into their infrastructure.

4 AN EXAMPLE OF WORKFLOW AT THE WORLD DATA CENTERS

At WDC-MARE, the web service client is embedded into the metadata publishing workflow of the PANGAEA Publishing Network for Geoscientific & Environmental Data. After inserting or updating a dataset in PANGAEA, the import client queues background services which keep the XML metadata repository up to date (see fig. 2). First, these background services marshal the metadata into an internal XML schema. This schema reflects the PANGAEA database structures and is optimized for simple marshalling of database records and transformation into other formats. With this software, the underlying database structure can be easily mapped to a given XML schema.
Because of the relational database structure, a change in one relational item can lead to a change in several XML files. Database update triggers fill the background services queue with changes for the related tables. This keeps the "flat" XML table in synchronization with the relational data. The internal XML is stored as a binary large object (blob) in a database table linked to the datasets. The full text search engine, however, provides fast search access to the metadata. These XML blobs can be transformed into various other schemas with XSLT on the fly.
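The trigger-and-queue mechanism described above can be sketched in a few lines of Python, using an in-memory SQLite database in place of the PANGAEA database. All table, trigger, and function names below are hypothetical, the two-table schema is a deliberately tiny stand-in, and the XML layout is invented for illustration; none of it reflects the actual PANGAEA schema or software. The XSLT step is left out because the Python standard library has no XSLT processor.

```python
import sqlite3
import xml.etree.ElementTree as ET

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE author  (id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE dataset (id INTEGER PRIMARY KEY, title TEXT,
                      author_id INTEGER REFERENCES author(id));
CREATE TABLE xml_queue   (dataset_id INTEGER);
CREATE TABLE dataset_xml (dataset_id INTEGER PRIMARY KEY, xml BLOB);

-- A new or changed dataset row queues itself.
CREATE TRIGGER dataset_inserted AFTER INSERT ON dataset
BEGIN
    INSERT INTO xml_queue VALUES (NEW.id);
END;
CREATE TRIGGER dataset_changed AFTER UPDATE ON dataset
BEGIN
    INSERT INTO xml_queue VALUES (NEW.id);
END;

-- A changed author row queues every dataset that refers to it,
-- so one relational change can refresh several XML documents.
CREATE TRIGGER author_changed AFTER UPDATE ON author
BEGIN
    INSERT INTO xml_queue SELECT id FROM dataset WHERE author_id = NEW.id;
END;
""")


def marshal(conn: sqlite3.Connection, dataset_id: int) -> bytes:
    """Flatten one dataset and its related author row into a single XML document."""
    row = conn.execute(
        "SELECT d.id, d.title, a.name FROM dataset d "
        "JOIN author a ON a.id = d.author_id WHERE d.id = ?",
        (dataset_id,),
    ).fetchone()
    root = ET.Element("dataset", id=str(row[0]))
    ET.SubElement(root, "title").text = row[1]
    ET.SubElement(root, "author").text = row[2]
    return ET.tostring(root, encoding="utf-8")


def process_queue(conn: sqlite3.Connection) -> int:
    """Background service: rebuild the XML blob of every queued dataset."""
    ids = [r[0] for r in conn.execute("SELECT DISTINCT dataset_id FROM xml_queue")]
    for dataset_id in ids:
        conn.execute(
            "INSERT OR REPLACE INTO dataset_xml VALUES (?, ?)",
            (dataset_id, marshal(conn, dataset_id)),
        )
    conn.execute("DELETE FROM xml_queue")
    return len(ids)


conn.execute("INSERT INTO author VALUES (1, 'A. Author')")
conn.execute("INSERT INTO dataset VALUES (1, 'Core A', 1)")
conn.execute("INSERT INTO dataset VALUES (2, 'Core B', 1)")
process_queue(conn)

# Changing one author row queues both datasets that reference it.
conn.execute("UPDATE author SET name = 'B. Author' WHERE id = 1")
refreshed = process_queue(conn)
print(refreshed)
```

In this sketch the queue is drained synchronously by calling `process_queue`; a real background service would presumably poll or be notified, which the text does not specify.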




Journal title:
  • Data Science Journal

Volume 5  Issue 

Pages  -

Publication date 2006